Robust Single-Step Adversarial Training with Regularizer

Authors

Abstract

High cost of training time caused by multi-step adversarial example generation is a major challenge in adversarial training. Previous methods try to reduce the computational burden of adversarial training using single-step adversarial example generation schemes, which can effectively improve efficiency but also introduce the problem of "catastrophic overfitting", where the robust accuracy against the Fast Gradient Sign Method (FGSM) achieves nearly 100% whereas the robust accuracy against Projected Gradient Descent (PGD) suddenly drops to 0% over a single epoch. To address this issue, we focus on the single-step adversarial training scheme in this paper and propose a novel Fast Gradient Sign Method with PGD Regularization (FGSMPR) to boost the efficiency of adversarial training without catastrophic overfitting. Our core observation is that single-step adversarial training cannot simultaneously learn robust internal representations of FGSM and PGD adversarial examples. Therefore, we design a PGD regularization term to encourage similar embeddings of FGSM and PGD adversarial examples. The experiments demonstrate that our proposed method can train a robust deep network for \(L_{\infty }\)-perturbations with FGSM adversarial training and close the gap with multi-step adversarial training.
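The objective described in the abstract (FGSM cross-entropy plus a term that pulls the embeddings of FGSM and PGD examples together) can be sketched as follows. This is a minimal NumPy illustration on a toy linear classifier, not the paper's implementation: the regularization weight `lam`, the squared-distance penalty on logits, and the PGD step size `alpha` are all assumptions made for the sketch.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def input_grad(W, x, y):
    """Gradient of softmax cross-entropy w.r.t. the input x (linear model logits = x @ W)."""
    p = softmax(x @ W)
    p[np.arange(len(x)), y] -= 1.0
    return p @ W.T

def fgsm(W, x, y, eps):
    """Single-step FGSM example: one signed gradient step of size eps."""
    return x + eps * np.sign(input_grad(W, x, y))

def pgd(W, x, y, eps, alpha=None, steps=7):
    """Multi-step PGD example, projected back into the L-inf ball of radius eps."""
    alpha = alpha or eps / 4
    x_adv = x + np.random.uniform(-eps, eps, x.shape)  # random start
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(input_grad(W, x_adv, y))
        x_adv = x + np.clip(x_adv - x, -eps, eps)      # projection
    return x_adv

def fgsmpr_loss(W, x, y, eps, lam=1.0):
    """FGSM cross-entropy + PGD regularization term (squared logit distance, an assumption)."""
    x_f, x_p = fgsm(W, x, y, eps), pgd(W, x, y, eps)
    logits_f, logits_p = x_f @ W, x_p @ W
    ce = -np.log(softmax(logits_f)[np.arange(len(x)), y] + 1e-12).mean()
    reg = np.mean((logits_f - logits_p) ** 2)  # encourages similar FGSM/PGD embeddings
    return ce + lam * reg
```

Training would minimize `fgsmpr_loss` over `W`; the regularizer keeps the representations of single-step and multi-step adversarial examples aligned, which is the mechanism the paper credits for avoiding catastrophic overfitting.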


Related articles

Unifying Bilateral Filtering and Adversarial Training for Robust Neural Networks

Recent analysis of deep neural networks has revealed their vulnerability to carefully structured adversarial examples. Many effective algorithms exist to craft these adversarial examples, but performant defenses seem to be far away. In this work, we attempt to combine denoising and robust optimization methods into a unified defense which we found to not only work extremely well, but also makes ...


The Robust Manifold Defense: Adversarial Training using Generative Models

Deep neural networks are demonstrating excellent performance on several classical vision problems. However, these networks are vulnerable to adversarial examples, minutely modified images that induce arbitrary attacker-chosen output from the network. We propose a mechanism to protect against these adversarial inputs based on a generative model of the data. We introduce a pre-processing step tha...


Robust Multilingual Part-of-Speech Tagging via Adversarial Training

Adversarial training (AT) is a powerful regularization method for neural networks, aiming to achieve robustness to input perturbations. Yet, the specific effects of the robustness obtained by AT are still unclear in the context of natural language processing. In this paper, we propose and analyze a neural POS tagging model that exploits adversarial training (AT). In our experiments on the Penn...


Robust Deep Reinforcement Learning with Adversarial Attacks

This paper proposes adversarial attacks for Reinforcement Learning (RL) and then improves the robustness of Deep Reinforcement Learning algorithms (DRL) to parameter uncertainties with the help of these attacks. We show that even a naively engineered attack successfully degrades the performance of DRL algorithm. We further improve the attack using gradient information of an engineered loss func...


Synthesizing Robust Adversarial Examples

Neural network-based classifiers parallel or exceed human-level accuracy on many common tasks and are used in practical systems. Yet, neural networks are susceptible to adversarial examples, carefully perturbed inputs that cause networks to misbehave in arbitrarily chosen ways. When generated with standard methods, these examples do not consistently fool a classifier in the physical world due t...



Journal

Journal title: Lecture Notes in Computer Science

Year: 2021

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-030-88013-2_3